Centre for Transforming Maintenance through Data Science, Curtin University
Published February 25, 2025
This markdown contains the code to reproduce the main analysis in Chap. 2. In it, I first show my development of the method of Kaminskiy and Krivtsov (2005) for constructing a joint prior for the two Weibull parameters. I then simulate a dataset that emulates the observation process of the idler-frames and demonstrate the method for imputing the partially observed lifetimes and missing truncation times according to the method described in Sec. 2.2 of the Thesis. Lastly, I summarise the results of simulation experiments that explore the method under different values of the simulation parameters; these experiments were run on a virtual machine due to the amount of memory and compute needed.
```r
# Function to sample from a truncated normal distribution
rTruncNorm <- function(n, mean, sd, lb = NULL, ub = NULL) {
  # Check if lower bound is supplied &
  # calculate lower bound of uniform dist
  if (!is.null(lb)) {
    lb_unif <- pnorm((lb - mean) / sd)
  } else {
    lb_unif <- 0
  }
  if (!is.null(ub)) {
    ub_unif <- pnorm((ub - mean) / sd)
  } else {
    ub_unif <- 1
  }
  # Sample from uniform distribution
  unif_rv <- runif(n, lb_unif, ub_unif)
  # Probability integral transformation
  norm_rv <- (qnorm(unif_rv) * sd) + mean
  return(norm_rv)
}

# Function used to calculate Weibull draws in the joint prior
fn <- function(x) {
  log(-log(1 - x))
}

# Function to plot the posterior CDF
PlotPostCDF <- function(stan_fit, x_range = c(0, 5), res = 0.05) {
  grid <- seq(x_range[1], x_range[2], 0.01)
  p_cdf <- stan_fit %>%
    as_draws_df() %>%
    select(beta, eta) %>%
    split(., seq(nrow(.))) %>%
    lapply(
      function(draw) {
        df_CDF <- data.frame(
          q = grid,
          p = pweibull(grid, draw$beta, draw$eta)
        )
        return(df_CDF)
      }
    ) %>%
    bind_rows() %>%
    ggplot() +
    stat_lineribbon(
      aes(x = q, y = p),
      .width = c(0.5, 0.9)
    ) +
    scale_fill_brewer() +
    geom_function(
      fun = pweibull,
      args = list(shape = beta, scale = eta),
      colour = "red"
    ) +
    xlab("t") +
    ylab("cdf") +
    theme_minimal() +
    theme(legend.position = "none")
  return(p_cdf)
}
```
```r
beta <- 1.1
eta <- 1
```
1 Informative prior demonstration
Using the methods of Kaminskiy and Krivtsov (2005), we can encode a joint prior for the Weibull parameters by eliciting information about the CDF. This is much more intuitive than eliciting information about the parameters directly. The result is a joint prior whose covariance encodes where we are uncertain. This is useful in cases where the likelihood is uninformative about some areas of the CDF but we do not want to impose a strong prior on the areas that the data are able to inform: for example, right censoring masks longer lifetimes, so we can inject information through the joint prior to inform the upper tail of the distribution while still allowing the data to update the lower tail. Figure 1 shows three different versions of a joint prior constructed around the same truth (\(\beta = 1.1\) and \(\eta = 1\)). Note that here, I use a truncated-normal distribution to simulate draws of the CDF at \(t_1\) and \(t_2\) rather than the beta distributions used in Kaminskiy and Krivtsov (2005).
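The construction itself can be sketched in a few lines of R. The elicitation points below match the first prior in Figure 1; the truncated-normal standard deviations and the simplified `rTruncNorm()` are illustrative assumptions, not the exact values used to make the figure.

```r
# Sketch of the joint prior construction: draw the CDF at t_1 and t_2 from
# truncated normals (the t_2 draw is truncated below at the t_1 draw so the
# CDF stays increasing), then solve for the unique Weibull (beta, eta) pair.
# The standard deviations (0.02, 0.05) are illustrative assumptions.
set.seed(7)
fn <- function(x) log(-log(1 - x))
rTruncNorm <- function(n, mean, sd, lb = -Inf, ub = Inf) {
  u <- runif(n, pnorm((lb - mean) / sd), pnorm((ub - mean) / sd))
  qnorm(u) * sd + mean
}
t_1 <- 0.07  # ~0.05 quantile of the true Weibull(1.1, 1)
t_2 <- 0.26  # ~0.20 quantile
F1 <- rTruncNorm(5000, 0.05, 0.02, lb = 0, ub = 1)
F2 <- rTruncNorm(5000, 0.20, 0.05, lb = F1, ub = 1)
beta_draws <- (fn(F2) - fn(F1)) / log(t_2 / t_1)
eta_draws <- exp(log(t_1) - fn(F1) / beta_draws)
```

Each `(beta_draws[i], eta_draws[i])` pair passes exactly through its elicited CDF values, which is what produces the correlated clouds in the top row of Figure 1.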
Figure 1: Three different informative joint priors constructed using the method of Kaminskiy and Krivtsov (2005). The joint draws of the parameters are shown in the top row—(a), (b), and (c)—and the corresponding uncertainty in the CDF is shown in the bottom—(d), (e), and (f). (a) and (d) show a prior where information is elicited around \(t_1 = 0.07\) and \(t_2 = 0.26\) (the 0.05 and 0.20 quantiles of the true distribution Weibull(1.1, 1)), (b) and (e) show a prior where information is elicited around \(t_1 = 0.46\) and \(t_2 = 1.04\) (the 0.35 and 0.65 quantiles), and (c) and (f) show a prior elicited at \(t_1 = 1.54\) and \(t_2 = 2.71\) (the 0.80 and 0.95 quantiles).
I implement the prior in the Stan models below, which means the draws from the prior are properly filtered through the likelihood. Section 4 demonstrates how this is not the case for the method presented in Kaminskiy and Krivtsov (2005).
2 Simulated data
In Sec. 2.5 of the main chapter, I demonstrate the methods for imputing left-truncated lifetimes with unknown exposure history and encoding information through an informative joint prior on a simulated dataset. I simulate data in a way that emulates the repeated replacement of a set of units (say, idler-frames) where the replacement time records are only available for a recent window of time. To do this, I sample many lifetimes (the exact number depends on the values of the simulation parameters) from a known Weibull distribution with shape \(\beta = 1.1\) and scale \(\eta = 1\) and assign them equally to ten units. I then take the cumulative sum of the lifetimes of each unit so that the values are now replacement times, rather than lifetimes. I then impose a window of observation with beginning t_start and end t_end, and any replacement records that are not at least partially observed in this window are discarded. Table 1 shows the simulated dataset when t_start = 5 and t_end = 6, and Figure 2 plots the data.
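A minimal sketch of this data-generating process, for orientation. The window values follow the text (t_start = 5, t_end = 6); the per-unit sample size and the object names here are assumptions for illustration, not the exact simulation code.

```r
# Sketch of the data-generating process described above.
set.seed(1)
t_start <- 5
t_end <- 6
n_units <- 10
n_per_unit <- 20  # enough lifetimes that each unit's history covers the window

sim <- do.call(rbind, lapply(seq_len(n_units), function(u) {
  life <- rweibull(n_per_unit, shape = 1.1, scale = 1)
  data.frame(
    unit = u,
    lifetime = life,
    failure_time = cumsum(life),        # cumulative sum -> replacement times
    install_time = cumsum(life) - life  # previous replacement time
  )
}))

# Discard records not at least partially inside the observation window
sim <- subset(sim, failure_time > t_start & install_time < t_end)

# Flag left truncation and right censoring relative to the window
sim$int_censored <- sim$install_time < t_start    # install time unobserved
sim$right_censored <- sim$failure_time > t_end    # failure time unobserved
```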
Table 1: The simulated data from a Weibull distribution with beta = 1.1 and eta = 1.
| unit | lifetime | failure_time | install_time | lifetime_id | int_censored | right_censored | install_time_obs | failure_time_obs |
|------|----------|--------------|--------------|-------------|--------------|----------------|------------------|------------------|
| 1 | 1.6554515 | 6.184863 | 4.529412 | 7 | TRUE | TRUE | 5.000000 | 6.000000 |
| 2 | 1.7090255 | 5.224012 | 3.514987 | 4 | TRUE | FALSE | 5.000000 | 5.224012 |
| 2 | 0.7681212 | 5.992133 | 5.224012 | 5 | FALSE | FALSE | 5.224012 | 5.992133 |
| 2 | 0.3140953 | 6.306229 | 5.992133 | 6 | FALSE | TRUE | 5.992133 | 6.000000 |
| 3 | 2.2831981 | 7.227092 | 4.943893 | 10 | TRUE | TRUE | 5.000000 | 6.000000 |
| 4 | 3.1079538 | 6.382651 | 3.274697 | 5 | TRUE | TRUE | 5.000000 | 6.000000 |
| 5 | 2.1444583 | 5.993422 | 3.848964 | 6 | TRUE | FALSE | 5.000000 | 5.993422 |
| 5 | 1.6915445 | 7.684967 | 5.993422 | 7 | FALSE | TRUE | 5.993422 | 6.000000 |
| 6 | 1.1210816 | 5.280433 | 4.159351 | 6 | TRUE | FALSE | 5.000000 | 5.280433 |
| 6 | 0.2149471 | 5.495380 | 5.280433 | 7 | FALSE | FALSE | 5.280433 | 5.495380 |
| 6 | 2.0838391 | 7.579219 | 5.495380 | 8 | FALSE | TRUE | 5.495380 | 6.000000 |
| 7 | 0.7130826 | 5.609107 | 4.896025 | 4 | TRUE | FALSE | 5.000000 | 5.609107 |
| 7 | 1.8861705 | 7.495278 | 5.609107 | 5 | FALSE | TRUE | 5.609107 | 6.000000 |
| 8 | 2.9895999 | 5.722490 | 2.732890 | 5 | TRUE | FALSE | 5.000000 | 5.722490 |
| 8 | 0.3860810 | 6.108571 | 5.722490 | 6 | FALSE | TRUE | 5.722490 | 6.000000 |
| 9 | 0.9822323 | 5.660214 | 4.677981 | 4 | TRUE | FALSE | 5.000000 | 5.660214 |
| 9 | 0.4619549 | 6.122169 | 5.660214 | 5 | FALSE | TRUE | 5.660214 | 6.000000 |
| 10 | 3.0724421 | 5.845491 | 2.773048 | 8 | TRUE | FALSE | 5.000000 | 5.845491 |
| 10 | 1.0326978 | 6.878188 | 5.845491 | 9 | FALSE | TRUE | 5.845491 | 6.000000 |
Figure 2: The simulated data from a Weibull distribution with beta = 1.1 and eta = 1.
Before moving on, I prepare the data to be passed to Stan.
```r
# Split data up into O, RC, LT, and LT+RC
sim_df_obs <- sim_df %>% filter(!(right_censored | int_censored))
sim_df_rc <- sim_df %>% filter(right_censored & !int_censored)
sim_df_lt <- sim_df %>% filter(!right_censored & int_censored)
sim_df_lt_rc <- sim_df %>% filter(right_censored & int_censored)

# Define the Stan data when LT are fully observed
stan_data_full <- list(
  N_obs = nrow(sim_df_obs),
  N_rc = nrow(sim_df_rc),
  N_lt = nrow(sim_df_lt),
  N_lt_rc = nrow(sim_df_lt_rc),
  y_obs = sim_df_obs$failure_time - sim_df_obs$install_time,
  y_rc = sim_df_rc$failure_time_obs - sim_df_rc$install_time,
  y_lt = sim_df_lt$failure_time - sim_df_lt$install_time,
  t_lt = sim_df_lt$install_time_obs - sim_df_lt$install_time,
  y_lt_rc = sim_df_lt_rc$failure_time_obs - sim_df_lt_rc$install_time,
  t_lt_rc = sim_df_lt_rc$install_time_obs - sim_df_lt_rc$install_time
)

# Define the Stan data when LT are discarded
stan_data_no_lt <- list(
  N_obs = nrow(sim_df_obs),
  N_rc = nrow(sim_df_rc),
  y_obs = sim_df_obs$failure_time - sim_df_obs$install_time,
  y_rc = sim_df_rc$failure_time_obs - sim_df_rc$install_time
)

# Define the Stan data when LT are imputed
stan_data_unkown_lt <- list(
  N_obs = nrow(sim_df_obs),
  N_rc = nrow(sim_df_rc),
  N_lt = nrow(sim_df_lt),
  N_lt_rc = nrow(sim_df_lt_rc),
  y_obs = sim_df_obs$failure_time - sim_df_obs$install_time,
  y_rc = sim_df_rc$failure_time_obs - sim_df_rc$install_time_obs,
  y_lt = sim_df_lt$failure_time_obs - sim_df_lt$install_time_obs,
  y_lt_rc = sim_df_lt_rc$failure_time_obs - sim_df_lt_rc$install_time_obs,
  t_start = t_start
)
```
2.1 Stan models (weakly informative)
Here I fit three models to the data. The first is a model where the left-truncated lifetimes (see thesis chapter for details) are fully observed—the best-case scenario. The second is a model that simply discards the left-truncated lifetimes—this is the best option in the literature if the exposure history of the left-truncated samples is unknown. The third and last model is my proposed method to impute the partially observed left-truncated samples with unknown exposure history. I fit all three models with the vague prior \[
\begin{align*}
\beta & \sim N^{+}(1.1, 1) \\
\eta & \sim N^{+}(1, 1). \\
\end{align*}
\]
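Here \(N^{+}\) denotes a normal distribution truncated below at zero. Draws from it can be generated with the same probability-integral-transform trick as `rTruncNorm()` above; a minimal standalone sketch for the \(\beta\) prior (variable names are illustrative):

```r
# Draw from N+(1.1, 1): a normal(1.1, 1) truncated below at 0,
# via the probability integral transform.
set.seed(42)
lb_unif <- pnorm((0 - 1.1) / 1)           # CDF value at the lower bound
u <- runif(1e4, lb_unif, 1)               # uniform draws above that value
beta_prior_draws <- (qnorm(u) * 1) + 1.1  # transform back to the normal scale
```

Because the uniform draws never fall below `pnorm(-1.1)`, every transformed draw is positive, matching the support constraint declared in the Stan `parameters` block.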
```r
stan_model_unknown_lt_rc <- stan_model(model_code = "
data {
  int N_obs;                             // N fully observed lives
  int N_rc;                              // N right censored only lives
  int N_lt;                              // N left truncated only lives
  int N_lt_rc;                           // N right cens and left trunc lives
  array[N_obs] real<lower=0> y_obs;      // Fully observed lifetimes
  array[N_rc] real<lower=0> y_rc;        // Right censored lifetimes
  array[N_lt] real<lower=0> y_lt;        // Left trunc lifetimes
  array[N_lt_rc] real<lower=0> y_lt_rc;  // Right cens and left trunc lifetimes
  real<lower=0> t_start;                 // Start of the observation window
}
transformed data {
  array[N_lt] real<lower=0> y_lt_upper;  // Upper bound of the left trunc lives
  for (m in 1:N_lt) {
    y_lt_upper[m] = y_lt[m] + t_start;   // Upper bound = lower bound + start of observation
  }
}
parameters {
  real<lower=0> beta;                    // Weibull shape
  real<lower=0> eta;                     // Weibull scale
  array[N_rc] real<lower=y_rc> y_rc_hat; // Imputed right censored values
  array[N_lt] real<lower=y_lt, upper=y_lt_upper> y_lt_hat;  // Imputed left trunc values
  array[N_lt_rc] real<lower=y_lt_rc> y_lt_rc_hat;  // Imputed left trunc and right cens values
  array[N_lt_rc] real<lower=0, upper=1> t_lt_rc_st;  // Imputed left truncation times for left trunc and right cens values (standardised)
}
transformed parameters {
  array[N_lt] real t_lt;  // Imputed left trunc times for left trunc values
  array[N_lt_rc] real<lower=0, upper=t_start> t_lt_rc_upper;
  array[N_lt_rc] real<lower=0, upper=t_lt_rc_upper> t_lt_rc;  // Imputed left trunc times for left trunc and right cens values
  for (i in 1:N_lt) {
    t_lt[i] = y_lt_hat[i] - y_lt[i];
  }
  for (k in 1:N_lt_rc) {
    if ((y_lt_rc_hat[k] - y_lt_rc[k]) < t_start)
      t_lt_rc_upper[k] = y_lt_rc_hat[k] - y_lt_rc[k];
    else
      t_lt_rc_upper[k] = t_start;
    t_lt_rc[k] = t_lt_rc_st[k] * t_lt_rc_upper[k];
  }
}
model {
  // Data model
  // Fully observed lifetimes
  y_obs ~ weibull(beta, eta);
  // Right censored lifetimes
  y_rc_hat ~ weibull(beta, eta);
  // Left truncated lifetimes
  for (i in 1:N_lt) {
    y_lt_hat[i] ~ weibull(beta, eta) T[t_lt[i], ];
  }
  // Left truncated and right censored lifetimes
  for (j in 1:N_lt_rc) {
    y_lt_rc_hat[j] ~ weibull(beta, eta) T[t_lt_rc[j], ];
  }
  // Prior model
  eta ~ normal(1, 1);
  beta ~ normal(1.1, 1);
  t_lt_rc_st ~ uniform(0, 1);
}
")
```
2.1.1 Sampling
I draw \(2000\) samples from the posterior of each model, using \(4\) chains with a burn-in of \(500\) samples and no thinning. Figure 3 shows the posterior draws of the parameters from the three models as well as the corresponding uncertainty in the Weibull CDF.
Figure 3: The draws from the joint posteriors conditioned on the simulated dataset when the left-truncated lifetimes are fully observed (a), discarded (b), or imputed (c) and a weakly informative prior is used. (d), (e), and (f) show the corresponding uncertainty around the CDF (in the form of the 0.5 and 0.8 uncertain intervals) that result from (a), (b), and (c), respectively. The true parameter values and CDF are shown in red.
2.2 Stan models (informative)
Now I fit the same three models but with an informative joint prior, eliciting information at \(t_1 = p_{0.80} = 1.54\) and \(t_2 = p_{0.95} = 2.71\). The prior is \[
\begin{align*}
\hat{F}_{t_1} \sim & \hbox{N}^{1}_{0}\left(0.8, 0.1\right) \\
\hat{F}_{t_2} \sim & \hbox{N}^{1}_{\hat{F}_{t_1}}\left(0.95, 0.05\right).
\end{align*}
\]
The prior is specified around the true model used to simulate the data. Now, I redefine the models with the informative joint prior.
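As a quick check, the elicitation times quoted above are the stated quantiles of the true Weibull(1.1, 1) distribution used to simulate the data:

```r
# Elicitation times as quantiles of the true distribution
t_1 <- qweibull(0.80, shape = 1.1, scale = 1)
t_2 <- qweibull(0.95, shape = 1.1, scale = 1)
round(c(t_1, t_2), 2)  # 1.54 and 2.71
```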
```r
stan_model_lt_r_imputed_inf <- stan_model(model_code = "
functions {
  // Function to simplify the calculation of eta and beta
  real fn(real tCDF) {
    return log(-log1m(tCDF));
  }
}
data {
  int N_obs;
  int N_rc;
  int N_lt;
  int N_lt_rc;
  vector<lower=0>[N_obs] y_obs;
  vector<lower=0>[N_rc] y_rc;
  vector<lower=0>[N_lt] y_lt;
  vector<lower=0>[N_lt] t_lt;
  vector<lower=0>[N_lt_rc] y_lt_rc;
  vector<lower=0>[N_lt_rc] t_lt_rc;
  // Define the prior
  real t_1;
  real t_2;
  real t1_mean;
  real t1_var;
  real t2_mean;
  real t2_var;
}
parameters {
  real<lower=0, upper=1> t1CDF;
  real<lower=t1CDF, upper=1> t2CDF;
  vector<lower=y_rc>[N_rc] y_rc_hat;
  vector<lower=y_lt_rc>[N_lt_rc] y_lt_rc_hat;
}
transformed parameters {
  real<lower=0> beta;
  real<lower=0> eta;
  // Calculate Weibull parameters based on the
  // draws from the CDF at t1 and t2.
  beta = (fn(t2CDF) - fn(t1CDF)) / log(t_2 / t_1);
  eta = exp(log(t_1) - (fn(t1CDF) / beta));
}
model {
  // Data model
  // Fully observed lifetimes
  y_obs ~ weibull(beta, eta);
  // Right censored lifetimes
  y_rc_hat ~ weibull(beta, eta);
  // Left truncated lifetimes
  for (i in 1:N_lt) {
    y_lt[i] ~ weibull(beta, eta) T[t_lt[i], ];
  }
  // Left truncated and right censored lifetimes
  for (j in 1:N_lt_rc) {
    y_lt_rc_hat[j] ~ weibull(beta, eta) T[t_lt_rc[j], ];
  }
  // Prior model
  t1CDF ~ normal(t1_mean, t1_var);
  t2CDF ~ normal(t2_mean, t2_var);
}
")
```
```r
stan_model_no_lt_inf <- stan_model(model_code = "
functions {
  // Function to simplify the calculation of eta and beta
  real fn(real tCDF) {
    return log(-log1m(tCDF));
  }
}
data {
  int N_obs;
  int N_rc;
  vector<lower=0>[N_obs] y_obs;
  vector<lower=0>[N_rc] y_rc;
  // Define the prior
  real t_1;
  real t_2;
  real t1_mean;
  real t1_var;
  real t2_mean;
  real t2_var;
}
parameters {
  real<lower=0, upper=1> t1CDF;
  real<lower=t1CDF, upper=1> t2CDF;
}
transformed parameters {
  real<lower=0> beta;
  real<lower=0> eta;
  // Calculate Weibull parameters based on the
  // draws from the CDF at t1 and t2.
  beta = (fn(t2CDF) - fn(t1CDF)) / log(t_2 / t_1);
  eta = exp(log(t_1) - (fn(t1CDF) / beta));
}
model {
  // Data model
  // Fully observed lifetimes
  target += weibull_lpdf(y_obs | beta, eta);
  // Right censored lifetimes
  target += weibull_lccdf(y_rc | beta, eta);
  // Prior model
  t1CDF ~ normal(t1_mean, t1_var);
  t2CDF ~ normal(t2_mean, t2_var);
}
")
```
```r
stan_model_unknown_lt_rc_inf <- stan_model(model_code = "
functions {
  // Function to simplify the calculation of eta and beta
  real fn(real tCDF) {
    return log(-log1m(tCDF));
  }
}
data {
  int N_obs;                             // N fully observed lives
  int N_rc;                              // N right censored only lives
  int N_lt;                              // N left truncated only lives
  int N_lt_rc;                           // N right cens and left trunc lives
  array[N_obs] real<lower=0> y_obs;      // Fully observed lifetimes
  array[N_rc] real<lower=0> y_rc;        // Right censored lifetimes
  array[N_lt] real<lower=0> y_lt;        // Left trunc lifetimes
  array[N_lt_rc] real<lower=0> y_lt_rc;  // Right cens and left trunc lifetimes
  real<lower=0> t_start;                 // Start of the observation window
  // Define the prior
  real t_1;
  real t_2;
  real t1_mean;
  real t1_var;
  real t2_mean;
  real t2_var;
}
transformed data {
  array[N_lt] real<lower=0> y_lt_upper;  // Upper bound of the left trunc lives
  for (m in 1:N_lt) {
    y_lt_upper[m] = y_lt[m] + t_start;   // Upper bound = lower bound + start of observation
  }
}
parameters {
  real<lower=0, upper=1> t1CDF;
  real<lower=t1CDF, upper=1> t2CDF;
  array[N_rc] real<lower=y_rc> y_rc_hat; // Imputed right censored values
  array[N_lt] real<lower=y_lt, upper=y_lt_upper> y_lt_hat;  // Imputed left trunc values
  array[N_lt_rc] real<lower=y_lt_rc> y_lt_rc_hat;  // Imputed left trunc and right cens values
  array[N_lt_rc] real<lower=0, upper=1> t_lt_rc_st;  // Imputed left truncation times for left trunc and right cens values (standardised)
}
transformed parameters {
  real<lower=0> beta;
  real<lower=0> eta;
  array[N_lt] real t_lt;  // Imputed left trunc times for left trunc values
  array[N_lt_rc] real<lower=0, upper=t_start> t_lt_rc_upper;
  array[N_lt_rc] real<lower=0, upper=t_lt_rc_upper> t_lt_rc;  // Imputed left trunc times for left trunc and right cens values
  // Calculate Weibull parameters based on the
  // draws from the CDF at t1 and t2.
  beta = (fn(t2CDF) - fn(t1CDF)) / log(t_2 / t_1);
  eta = exp(log(t_1) - (fn(t1CDF) / beta));
  for (i in 1:N_lt) {
    t_lt[i] = y_lt_hat[i] - y_lt[i];
  }
  for (k in 1:N_lt_rc) {
    if ((y_lt_rc_hat[k] - y_lt_rc[k]) < t_start)
      t_lt_rc_upper[k] = y_lt_rc_hat[k] - y_lt_rc[k];
    else
      t_lt_rc_upper[k] = t_start;
    t_lt_rc[k] = t_lt_rc_st[k] * t_lt_rc_upper[k];
  }
}
model {
  // Data model
  // Fully observed lifetimes
  y_obs ~ weibull(beta, eta);
  // Right censored lifetimes
  y_rc_hat ~ weibull(beta, eta);
  // Left truncated lifetimes
  for (i in 1:N_lt) {
    y_lt_hat[i] ~ weibull(beta, eta) T[t_lt[i], ];
  }
  // Left truncated and right censored lifetimes
  for (j in 1:N_lt_rc) {
    y_lt_rc_hat[j] ~ weibull(beta, eta) T[t_lt_rc[j], ];
  }
  // Prior model
  t1CDF ~ normal(t1_mean, t1_var);
  t2CDF ~ normal(t2_mean, t2_var);
  t_lt_rc_st ~ uniform(0, 1);
}
")
```
2.2.1 Sampling
I once again draw \(2000\) samples from the posteriors, using four chains with a burn-in of \(500\) samples and no thinning. Figure 4 shows the new posterior draws of the parameters from the three models as well as the corresponding uncertainty in the Weibull CDF.
Figure 4: The draws from the joint posteriors conditioned on the simulated dataset when the left-truncated lifetimes are fully observed (a), discarded (b), or imputed (c) and the informative joint prior is used. (d), (e), and (f) show the corresponding uncertainty around the CDF (in the form of the 0.5 and 0.8 uncertainty intervals) that result from (a), (b), and (c), respectively. The true parameter values and CDF are shown in red.
To demonstrate that my implementation of the method proposed by Kaminskiy and Krivtsov (2005) properly filters the prior through the likelihood, Figure 5 compares the prior and posterior of the CDF at the two elicitation times \(t_1\) and \(t_2\). In all three models, the prior belief about the CDF at both times has been updated in the posterior after conditioning on the data. This is not the case in the original method proposed by Kaminskiy and Krivtsov (2005), as I show in Section 4 of this document.
Figure 5: Comparison of the marginal prior and posterior for \(F(t_1)\) and \(F(t_2)\) when left-truncated lifetimes are fully observed (a), discarded (b), or imputed (c) to show how both the elicited distributions have been updated in the posterior.
3 Sim study
The simulation study was run on a virtual machine on the Nimbus cloud service provided by Pawsey. The R scripts for the simulation studies for \(\beta = 0.8\), \(1.1\), and \(2\) are left-trunc-sim-study-0.8.R, left-trunc-sim-study-1.1.R, and left-trunc-sim-study-2.R, respectively. A detailed description of the simulation study is provided in Sec. 2.5 of the Thesis.
Figure 6: The simulation results when the shape parameter equals \(1.1\), summarised by the mean of the one hundred repetitions for each factor combination. The columns show the three levels of \(t_{end} - t_{start}\), and the rows show the levels of \(t_{start}\) and \(N\). The Bayesian \(p\)-values of shape and scale are reported on the horizontal and vertical axis, respectively, and the colours of the points show the treatment of the left-truncated samples, and the shape shows the prior.
Figure 7: The elppd\({}_{100}\) simulation results when the shape parameter equals \(1.1\) for each factor combination. The columns show the different levels of \(t_{end} - t_{start}\) and rows the different levels of \(t_{start}\). The number of units \(N\) is shown on the horizontal axis, and the average elppd\({}_{100}\) score is shown on the vertical axis. The colours of the points show the treatment of the left-truncated samples and the shape shows the prior.
Figure 8: The elppd\({}_{100}\) simulation results when the shape parameter equals \(1.1\) for each factor combination summarised by the mean of the one hundred repetitions.
Figure 9: The simulation results when the shape parameter equals \(0.8\), summarised by the mean of the one hundred repetitions for each factor combination. The columns show the three levels of \(t_{end} - t_{start}\), and the rows show the levels of \(t_{start}\) and \(N\). The Bayesian \(p\)-values of shape and scale are reported on the horizontal and vertical axis, respectively, and the colours of the points show the treatment of the left-truncated samples, and the shape shows the prior.
Figure 10: The elppd\({}_{100}\) simulation results when the shape parameter equals \(0.8\) for each factor combination. The columns show the different levels of \(t_{end} - t_{start}\) and rows the different levels of \(t_{start}\). The number of units \(N\) is shown on the horizontal axis, and the average elppd\({}_{100}\) score is shown on the vertical axis. The colours of the points show the treatment of the left-truncated samples and the shape shows the prior.
Figure 11: The elppd\({}_{100}\) simulation results when the shape parameter equals \(0.8\) for each factor combination summarised by the mean of the one hundred repetitions.
Figure 12: The simulation results when the shape parameter equals \(2.0\), summarised by the mean of the one hundred repetitions for each factor combination. The columns show the three levels of \(t_{end} - t_{start}\), and the rows show the levels of \(t_{start}\) and \(N\). The Bayesian \(p\)-values of shape and scale are reported on the horizontal and vertical axis, respectively, and the colours of the points show the treatment of the left-truncated samples, and the shape shows the prior.
Figure 13: The elppd\({}_{100}\) simulation results when the shape parameter equals \(2.0\) for each factor combination. The columns show the different levels of \(t_{end} - t_{start}\) and rows the different levels of \(t_{start}\). The number of units \(N\) is shown on the horizontal axis, and the average elppd\({}_{100}\) score is shown on the vertical axis. The colours of the points show the treatment of the left-truncated samples and the shape shows the prior.
Figure 14: The elppd\({}_{100}\) simulation results when the shape parameter equals \(2.0\) for each factor combination summarised by the mean of the one hundred repetitions.
4 Appendix A
In this appendix I demonstrate how the joint posterior of \(\beta\) and \(\eta\) derived using the method originally proposed in Kaminskiy and Krivtsov (2005) depends on the analyst's choice of \(t_1\) or \(t_2\) when regenerating the posterior draws after updating the distribution of the CDF at time \(t_3\). Figure 15 (a) shows the joint posterior when the distributions at \(t_3\) and \(t_2\) are used to generate the posterior draws, and (b) shows the joint posterior when the distributions at \(t_3\) and \(t_1\) are used. The two joint distributions in Figure 15 are clearly different.
```r
CalcBetaParams <- function(m, v) {
  a <- ((m^2 - m^3) / v) - m
  b <- (a / m) - a
  return(list(shape1 = a, shape2 = b))
}

rbetaMeanSd <- function(N, m, v) {
  pars <- CalcBetaParams(m, v)
  random_samp <- rbeta(N, pars$shape1, pars$shape2)
  return(random_samp)
}

# Define elicitation and observation times
t_1 <- qweibull(0.3, beta, eta)
t_2 <- qweibull(0.7, beta, eta)
t_3 <- qweibull(0.2, beta, eta)

# Generate samples from the joint prior
prior <- data.frame(
  F1 = rbetaMeanSd(10000, 0.3, 0.05),
  F2 = rbetaMeanSd(10000, 0.7, 0.02)
) %>%
  filter(F2 > F1) %>%
  mutate(
    beta = (fn(F2) - fn(F1)) / log(t_2 / t_1),
    eta = exp(log(t_1) - (fn(F1) / beta)),
    F3_prior = pweibull(t_3, beta, eta)
  )

# Calculate corresponding values of the beta distribution at t_3
F3_prior_params <- CalcBetaParams(
  m = mean(prior$F3_prior),
  v = var(prior$F3_prior)
)

# Generate some binomial data from the true data generating process
binom_data <- rbinom(n = 1, size = 10, prob = 0.2)

# Calculate posterior of F at t_3
shape1_post <- F3_prior_params$shape1 + binom_data
shape2_post <- F3_prior_params$shape2 + (10 - binom_data)

# Generate samples from the joint posterior using t_2
post1 <- data.frame(
  F3_post = rbeta(10000, shape1_post, shape2_post),
  F2 = rbetaMeanSd(10000, 0.7, 0.02)
) %>%
  filter(F2 > F3_post) %>%
  mutate(
    beta = (fn(F2) - fn(F3_post)) / log(t_2 / t_3),
    eta = exp(log(t_3) - (fn(F3_post) / beta))
  )

p1 <- post1 %>%
  ggplot(aes(x = beta, y = eta)) +
  geom_point() +
  ylim(0, 10) +
  xlim(0, 10) +
  xlab(expression(P[beta])) +
  ylab(expression(P[eta])) +
  theme_minimal()

# Generate samples from the joint posterior using t_1
post2 <- data.frame(
  F3_post = rbeta(10000, shape1_post, shape2_post),
  F1 = rbetaMeanSd(10000, 0.3, 0.05)
) %>%
  filter(F1 > F3_post) %>%
  mutate(
    beta = (fn(F1) - fn(F3_post)) / log(t_1 / t_3),
    eta = exp(log(t_3) - (fn(F3_post) / beta))
  )

p2 <- post2 %>%
  ggplot(aes(x = beta, y = eta)) +
  geom_point() +
  ylim(0, 10) +
  xlim(0, 10) +
  xlab(expression(P[beta])) +
  ylab(expression(P[eta])) +
  theme_minimal()

# Compare the two posteriors
cowplot::plot_grid(
  p1, p2,
  nrow = 1, ncol = 2,
  labels = c("(a)", "(b)"),
  label_fontfamily = "Times",
  label_face = "plain"
)
```
Figure 15: The joint posterior of shape and scale when (a) the distributions at \(t_3\) and \(t_2\) are used to generate the posterior draws and (b) shows the joint posterior when the distributions at \(t_3\) and \(t_1\) are used.
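The moment-matching in `CalcBetaParams()` above can be verified directly: for a Beta(\(a\), \(b\)) distribution the mean is \(a/(a+b)\) and the variance is \(ab/\big((a+b)^2(a+b+1)\big)\). The helper is reproduced here (with `v` treated as a variance, as in the code above) so the check stands alone:

```r
# Moment-matching for the Beta distribution, reproduced from the appendix code.
CalcBetaParams <- function(m, v) {
  a <- ((m^2 - m^3) / v) - m
  b <- (a / m) - a
  list(shape1 = a, shape2 = b)
}

pars <- CalcBetaParams(m = 0.3, v = 0.01)
mean_check <- pars$shape1 / (pars$shape1 + pars$shape2)
var_check <- (pars$shape1 * pars$shape2) /
  ((pars$shape1 + pars$shape2)^2 * (pars$shape1 + pars$shape2 + 1))
```

Substituting \(a = m^2(1-m)/v - m\) and \(b = a(1-m)/m\) into the Beta moments recovers the target mean and variance exactly.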
References
Kaminskiy, Mark, and Vasiliy Krivtsov. 2005. “A Simple Procedure for Bayesian Estimation of the Weibull Distribution.” IEEE Transactions on Reliability 54 (January): 612–16.